Abstract

Attractors in asymmetric neural networks with deterministic parallel dynamics
present a "chaotic" regime at symmetry $\eta < 0.5$, where the average length of the
cycles increases exponentially with system size, and an oscillatory regime at high
symmetry, where the average length of the cycles is 2 [10]. We show, both with analytic
arguments and numerically, that there is a sharp transition, at a critical symmetry
$\eta_c = 0.33$, between a phase where the typical cycles have length 2 and basins of
attraction of vanishing weight and a phase where the typical cycles are exponentially
long with system size, and the weights of their attraction basins are distributed as in
a Random Map with reversal symmetry. The time-scale after which cycles are reached
grows exponentially with system size $N$, and the exponent vanishes in the symmetric
limit, where $T \propto N^{2/3}$. The transition can be related to the dynamics of the infinite
system (where cycles are never reached), using the closing probabilities as a tool.

We also study the relaxation of the function $E(t) = -\frac{1}{N}\sum_i |h_i(t)|$, where $h_i$
is the local field experienced by neuron $i$. In the symmetric system, it plays
the role of a Ljapunov function which drives the system towards its minima through
steepest descent. This interpretation survives, even if only on average, for small
asymmetry as well. The asymmetry acts like an effective temperature: the larger the
asymmetry, the faster the relaxation and the higher the asymptotic value reached. $E$
reaches very deep minima on the fixed points of the dynamics, which are reached with
vanishing probability, and attains a larger value on the typical attractors, which are
cycles of length 2.



1 Introduction 



The dynamics of Attractor Neural Networks with randomly distributed synaptic couplings
has been investigated in several works in recent years, with particular attention to the
effects of the asymmetry of the synaptic couplings [12].
This generalization is interesting not only because in the brain synapses are not symmetric,
but also because it introduces new complex features in the dynamics of such systems. At
symmetry higher than a critical value the system never completely loses memory of its
initial state [12], while at zero symmetry the dynamics has the essential characteristics
of chaos in continuous systems even though the model has a finite state space [13] (in this
case it has been shown analytically that chaos is suppressed by strong enough thermal noise [15]).

At low symmetry the average length of the attractors increases exponentially with the size
of the system, while for high symmetry the typical attractors are cycles of length 2. This kind
of transition also takes place in other discrete disordered dynamical systems [16]. The model
that we consider here is very simple, and a detailed study of this transition is possible.
Moreover, the study of asymmetry is interesting in the context of neural networks, both
because it drops the unrealistic requirement of symmetric synaptic couplings formulated in
the classical Hopfield model [18] and because the existence of chaotic attractors can be a way
to distinguish between a network converging to a regular attractor, since it has remembered a
pattern, and a network which is in a confused state. On the other hand, the model that we
study can also be seen as a modification of the mean field model of Spin Glass proposed by
Sherrington and Kirkpatrick [20], at temperature $T = 0$ (if the couplings are not symmetric
the system is not Hamiltonian, and even in the case of symmetric couplings the Hamiltonian
of the system differs from the SK Hamiltonian because of the parallel dynamics, see below).

The model that we consider consists of $N$ Ising neurons $\sigma_i = \pm 1$ which are updated
simultaneously according to deterministic rules:

$$\sigma_i(t+1) = \mathrm{sign}\Big(\sum_j J_{ij}\,\sigma_j(t)\Big). \qquad (1)$$

The $J_{ij}$ (synaptic couplings) are quenched random variables chosen from a distribution
with average value zero and variance $1/N$. The exact form of the distribution is irrelevant
in the infinite size limit, so we chose a two-valued distribution, which is easiest to implement
numerically. The degree of symmetry of the synaptic couplings is parameterized by $\eta$,
which measures the correlation between $J_{ij}$ and $J_{ji}$.
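
The model can be illustrated with a minimal sketch. The construction of the couplings below is an assumption made for illustration, not necessarily the procedure used for the simulations reported here: each $J_{ij}$ with $i < j$ is drawn as $\pm 1/\sqrt{N}$, and $J_{ji}$ is set equal to $J_{ij}$ with probability $(1+\eta)/2$ and to $-J_{ij}$ otherwise, which gives a two-valued distribution with correlation $\eta/N$ between $J_{ij}$ and $J_{ji}$. The names `make_couplings`, `local_fields` and `step` are hypothetical, and the convention sign(0) = +1 is also an assumption.

```python
import random


def make_couplings(n, eta, rng):
    """Two-valued couplings J[i][j] = +-1/sqrt(n) with symmetry parameter eta.

    Assumed construction: J[j][i] copies J[i][j] with probability (1 + eta) / 2
    and takes the opposite sign otherwise, so that the average of
    J[i][j] * J[j][i] equals eta / n. The diagonal is set to zero.
    """
    scale = n ** -0.5
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = scale if rng.random() < 0.5 else -scale
            same = rng.random() < (1.0 + eta) / 2.0
            J[j][i] = J[i][j] if same else -J[i][j]
    return J


def local_fields(J, sigma):
    """Local fields h_i = sum_j J[i][j] * sigma[j]."""
    return [sum(Jij * sj for Jij, sj in zip(row, sigma)) for row in J]


def step(J, sigma):
    """One parallel update of all neurons, as in equation (1).

    sign(0) is mapped to +1 by convention (an assumption of this sketch).
    """
    return tuple(1 if h >= 0 else -1 for h in local_fields(J, sigma))


if __name__ == "__main__":
    rng = random.Random(0)
    J = make_couplings(16, 0.5, rng)
    sigma = tuple(rng.choice((-1, 1)) for _ in range(16))
    for _ in range(5):
        sigma = step(J, sigma)
    print(sigma)
```

With $\eta = 1$ this construction gives symmetric couplings and with $\eta = -1$ antisymmetric ones; intermediate values interpolate between the two.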

If the limit $N \to \infty$ is taken before the limit $t \to \infty$, it is possible to apply a
Monte Carlo method that reproduces exactly the dynamics of a single spin in the infinite system
[17]. This approach [12] and previous numerical work [6] show the presence of a phase
transition at $\eta_{rm} = 0.825$. At symmetry larger than $\eta_{rm}$ a system that was initially in a
magnetized state never forgets the initial configuration, and the remanent magnetization,
$m_\infty = \lim_{l\to\infty} C(0,l)$, where $C$ is the correlation function, is different from zero. At low
symmetry the initial condition is completely forgotten.

If the limit $t \to \infty$ is taken in a finite system, a different situation is found. Since
configuration space is finite and the dynamics is deterministic, after a transient time the
motion takes place on periodic orbits. The length of such orbits, the size of their attraction
basins and the number of orbits are random variables, depending on the realization of the
couplings $J_{ij}$. We are interested in their scaling behavior as a function of system size, $N$.
One may ask how this situation is related to the dynamics of an infinite system,
where attractors do not exist. We will show that the properties of the attractors can be
deduced from the statistical properties of the correlation function in the infinite system.
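
The statement that a deterministic map on a finite space always ends on a periodic orbit can be made concrete with a short sketch. The helper `find_cycle` below is a hypothetical name; it stores every visited state, so its memory cost grows with the transient and cycle length and it is only meant for small systems.

```python
def find_cycle(update, start):
    """Iterate a deterministic map until a state repeats.

    `update` is any function mapping a hashable state to the next state.
    Returns (transient, length): the number of steps before the periodic
    orbit is entered, and the length of that orbit.
    """
    first_seen = {}
    state, t = start, 0
    while state not in first_seen:
        first_seen[state] = t
        state = update(state)
        t += 1
    transient = first_seen[state]
    return transient, t - transient


if __name__ == "__main__":
    # Toy map on the integers 0..10, used only to exercise the function:
    # 0 -> 1 -> 2 -> 5 -> 4 -> 6 -> 4 -> ...
    print(find_cycle(lambda x: (x * x + 1) % 11, 0))
```

For the network, `update` would be the parallel update of equation (1) applied to a tuple of $\pm 1$ values.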
If the synaptic couplings are symmetric$^1$ (i.e., $J_{ij} = J_{ji}$), one can define [19] the function

$$E(t) = -\frac{1}{N}\sum_{i,j} J_{ij}\,\sigma_i(t+1)\,\sigma_j(t) = -\frac{1}{N}\sum_i \Big|\sum_j J_{ij}\,\sigma_j(t)\Big|, \qquad (3)$$

which is a non-increasing function of time. Since $E(t)$ attains a constant value only on fixed
points and on cycles of length 2, these are the only attractors of the system. On the fixed
points the function $E$ coincides with the Hamiltonian of the SK model of Spin Glass [20],
$H = -\frac{1}{N}\sum_{i,j} J_{ij}\,\sigma_i\sigma_j$, thus the fixed points of the symmetric networks are also metastable states
for the SK model. The typical attractors are nevertheless cycles of length 2: nearly every
initial point (in the limit $N \to \infty$) converges to a cycle of length 2. For symmetry reasons,
the value of $H$ is on average zero on such cycles, but $E$ has a low value. We observed that
the lowest values of $E$ are attained on the fixed points, which are reached with vanishing
probability. Thus, from the point of view of relaxation, cycles of length 2 act as traps.
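
The monotonic decrease of $E(t)$ for symmetric couplings can be checked numerically on a small system. The sketch below is only illustrative: it assumes two-valued symmetric couplings $\pm 1/\sqrt{N}$ with zero diagonal and the convention sign(0) = +1, and the names `symmetric_couplings`, `fields`, `energy` and `energy_trace` are hypothetical.

```python
import random


def symmetric_couplings(n, rng):
    """Symmetric two-valued couplings J[i][j] = J[j][i] = +-1/sqrt(n)."""
    scale = n ** -0.5
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = scale if rng.random() < 0.5 else -scale
    return J


def fields(J, sigma):
    """Local fields h_i = sum_j J[i][j] * sigma[j]."""
    return [sum(Jij * sj for Jij, sj in zip(row, sigma)) for row in J]


def energy(J, sigma):
    """E = -(1/N) * sum_i |h_i|, evaluated on the configuration sigma."""
    return -sum(abs(h) for h in fields(J, sigma)) / len(sigma)


def energy_trace(J, sigma, steps):
    """Values of E along `steps` parallel updates starting from sigma."""
    trace = []
    for _ in range(steps):
        trace.append(energy(J, sigma))
        sigma = tuple(1 if h >= 0 else -1 for h in fields(J, sigma))
    return trace


if __name__ == "__main__":
    rng = random.Random(1)
    J = symmetric_couplings(32, rng)
    sigma = tuple(rng.choice((-1, 1)) for _ in range(32))
    trace = energy_trace(J, sigma, 30)
    # For symmetric couplings each value should not exceed the previous one,
    # up to floating-point rounding.
    print(all(b <= a + 1e-12 for a, b in zip(trace, trace[1:])))
```

Replacing the symmetric couplings with asymmetric ones in this sketch is a simple way to see that individual steps may then increase $E$.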

Let us summarize our results about non-symmetric synaptic couplings. In this case, $E(t)$
may increase and it is no longer a Ljapunov function. Nevertheless, we saw that its
average value is a non-increasing function also for asymmetric couplings. Cycles of every
length may exist but, if the asymmetry is small, the typical cycles are still cycles of length
2. Each of them has an attraction basin of vanishing weight (so that the average weight of
the attraction basins is zero), but the sum of such weights tends to 1 in the infinite size limit.
This situation persists up to $\eta_c \approx 0.33$, where the sum of the weights of cycles of length 2
suddenly drops to 0 in the infinite size limit. The typical attractors are now very long cycles,
whose length increases exponentially with system size (chaotic phase). The number of such
cycles is much smaller than the number of cycles of length 1, but the average weight of their
attraction basins is finite, as in the Random Map model [21]. Indeed, we argue that the
distribution of the weights is the same as in a Random Map model with reversal symmetry,
as in the limit case $\eta = 0$ [21]$^2$.

Between $\eta = 0.5$ and $\eta = 0.33$ long cycles still occupy a negligible portion of phase space, but
the average length is dominated by the tails of the distribution, and increases exponentially
with system size, in agreement with the results of Nützel [10]. However, we prefer to place the
transition between the two regimes at $\eta_c = 0.33$, where the nature of the typical attractors
abruptly changes. We also measured the typical transient time (the time necessary to reach
a cycle). It appears from our data, in contrast with previous numerical results, that the
transient time increases exponentially with system size for $\eta < 1$, and as $N^{2/3}$ for $\eta = 1$.
For $\eta \lesssim 1$ there are two regimes: a power law at small $N$ (roughly with the same exponent
2/3 as in the case $\eta = 1$) and an exponential increase for larger systems, with an exponent
vanishing in the limit $\eta \to 1$. This behavior is in agreement with our theoretical expectations
(see below).

$^1$ Another important model with symmetric couplings is the Hopfield model [18], the prototype of
Attractor Neural Networks, where the $J_{ij}$ are given by the Hebb rule,
$J_{ij} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^{\mu}\xi_j^{\mu}$.

$^2$ The case $\eta = 0$ is in some sense peculiar, since the average number of attractors increases linearly with
system size [21], as in a Random Map and in contrast with the case $\eta \neq 0$, where the number of fixed points
increases exponentially with $N$.

The above observations hold for $\eta > 0$. The case $\eta < 0$, which corresponds to
antisymmetric couplings, is related in a simple way to the symmetric case in the infinite size
limit, as we shall see. In this case the phase space is dominated by cycles of length 4 for
$\eta < -0.33$, while exponentially long cycles prevail for larger values of $\eta$. The transition point
is $\eta_c = -0.33$.

We study, numerically and in part analytically, the closing probabilities, which express the
probability that a trajectory not yet closed closes on a cycle of length $l$ after a time $t + l$. The
closing probabilities are nothing but the tail of the distribution of the correlation function
$q(t, t+l)$, thus they establish a link between the properties of the infinite system and the
properties of the attractors [22]. In this framework, the transition can be understood as
follows: since the distribution of $q(t, t+2)$ is peaked around a much larger value than for any
other value of $l$, the closing probability on cycles of length 2 is larger than for any other
cycle by a factor exponentially large in system size. This effect decreases as $\eta$ decreases
and, at a critical value of the parameter, it is not enough to balance the large number of
possible long cycles, which then prevail.

The paper is organized as follows: in section 2 we study analytically the properties of the 
attractors under some hypothesis on the closing probabilities. We derive a condition on the 
exponent of the closing probability that corresponds to the dynamical transition. We argue 
that the distribution of the weights of the attraction basins is the same as for a Random 
Map with reversal symmetry in the whole chaotic phase, while the typical weights tend to 
zero in the oscillatory phase. Thus also this quantity has a discontinuity at the transition. 
In section 3 we present our numerical results, reporting the properties of the attractors, the 
relaxation of E(t), the distribution of the correlation functions and the closing probabilities. 
In the final discussion we point out possible extensions of this study. 

2 Closing probabilities and attractors 

The natural distance in the configuration space of the system is the Hamming distance, which
measures the number of elements in a different state in two configurations. Equivalent
information is given by the correlation between configurations, or overlap. Let us consider
the overlap between two configurations at different times along the same trajectory:

$$q(t,t+l) = \frac{1}{N}\sum_i \sigma_i(t)\,\sigma_i(t+l). \qquad (4)$$

This is a random variable, depending a) on the quenched disorder (dynamical rules), and
b) on the randomly chosen initial point. Knowing its distribution, we can reconstruct all the
properties of the attractors. In particular, we have to compute the probability that $q(t,t+l)$
is equal either to 1 or to $-1$. The first case corresponds to a trajectory that, after a transient
time $t$, enters a limit cycle of length $l$. The case $q(t,t+l) = -1$ corresponds to a trajectory
that, after a transient time $t$, enters a cycle of length $2l$. In fact, if the configuration $C(t)$
has been reversed after a time $l$, $C(t+l)$ will also be reversed after a time $l$, since the map (1)
commutes with the reversal operator $\mathcal{R}$ defined by
$\mathcal{R}\{\sigma_1, \dots, \sigma_N\} = \{-\sigma_1, \dots, -\sigma_N\}$; thus $C(t+2l)$ will be
equal to $C(t)$.
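
The two ways in which a trajectory can close can be sketched in a few lines. The function names `overlap`, `first_closing` and `cycle_length_from_closing` are hypothetical; the sketch scans a stored trajectory for the first pair of times whose overlap (4) equals $+1$ or $-1$, and converts the result into a cycle length under the reversal argument given above.

```python
def overlap(a, b):
    """Overlap q = (1/N) * sum_i a_i * b_i between two +-1 configurations."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def first_closing(trajectory):
    """First closing event along a stored trajectory.

    Returns (t, l, q) for the earliest time t + l at which the configuration
    equals (q = +1) or is the reversal of (q = -1) the configuration at an
    earlier time t, or None if no such event occurs in the stored segment.
    """
    for t_prime in range(1, len(trajectory)):
        n = len(trajectory[t_prime])
        for t in range(t_prime):
            s = sum(x * y for x, y in zip(trajectory[t], trajectory[t_prime]))
            if s == n:
                return t, t_prime - t, 1
            if s == -n:
                return t, t_prime - t, -1
    return None


def cycle_length_from_closing(l, q):
    """Cycle length implied by a closing event, for a map that commutes
    with the reversal of all spins: l if q = +1, 2 * l if q = -1."""
    return l if q == 1 else 2 * l


if __name__ == "__main__":
    # Hand-made trajectory: the third configuration is the reversal of the first.
    trajectory = [(1, 1, -1), (1, -1, -1), (-1, -1, 1)]
    t, l, q = first_closing(trajectory)
    print(t, l, q, cycle_length_from_closing(l, q))
```

The conversion in `cycle_length_from_closing` is only valid when the dynamics commutes with the reversal operator, as the map (1) does.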



We are interested in the first time when a trajectory "closes", visiting a configuration
already attained. Thus, we have to impose the condition that no configuration has yet been
repeated before the time $t + l$. The closing probabilities are then conditional probabilities:

$$\pi_N^{+}(t,t') = \Pr\{q(t,t') = 1 \mid A_{t'}\}, \qquad (5)$$
$$\pi_N^{-}(t,t') = \Pr\{q(t,t') = -1 \mid A_{t'}\},$$

where the symbol $A_{t'}$ represents the condition that neither $q = 1$ nor $q = -1$ ever
occurred before time $t'$.

2.1 The overlap in an infinite system 

Taking this opening condition into account reconciles the apparent discrepancy between the
behavior of a finite system [11] and of an infinite one. In an infinite system, the average$^3$
overlap after two time-steps, $Q(t,t+2) = \overline{q(t,t+2)}$, tends, as $t \to \infty$, to an asymptotic
value that is an analytic function of $\eta$. In contrast, in a finite system, if time is long enough
a limit cycle is reached. Then $Q(t,t+2)$ tends to 1 with probability 1 in the oscillatory
phase while it is less than 1 in the chaotic phase.

Because of the opening condition we can use, as a first approximation, the properties of
the correlation function in an infinite system to derive the properties of the attractors in a
finite system. We list here some properties of the conditional distribution of $q(t,t+l)$ which
we expect to hold if the limit $N \to \infty$ is taken with $t$ and $l$ fixed. In what follows, the
opening condition is always assumed to hold.

We consider first $\eta > 0$. Qualitatively, we expect that the behavior of the overlap at $-\eta$
is related in a simple way to the one at $\eta$. A simple argument runs like this: all the couplings
where $J_{ij}J_{ji}$ is positive tend to align $\sigma_i(t)$ and $\sigma_i(t+l)$, so that for $\eta > 0$ there is an effective
ferromagnetic interaction between the time slice at time $t$ and the time slice at time $t+l$,
and an effective antiferromagnetic interaction for $\eta < 0$. These interactions have different
signs but the same strength. They determine the main features of the overlap distribution:
for instance, the variance and the non-vanishing value of $Q(t,t+l)$ (for even $l$; for odd $l$ the
distribution is symmetric).

1. The overlap distribution depends exponentially on system size:

$$\Pr\{q(t,t+l) = q\} \approx A_{t,t+l}(q;\eta)\,\exp\big(-N\alpha_{t,t+l}(q;\eta)\big). \qquad (6)$$

This is a kind of finite entropy statement, and it can be understood as follows: the
overlap is the average of $N$ terms. They are not independent, but their correlations
are small enough so that the variance of the overlap is of order $1/N$. Thus, in analogy
with the Shannon-McMillan theorem, we expect the exponential form (6)$^4$.

$^3$ Here, as is usual in the theory of disordered systems, the over-line denotes the average with respect to the
disorder, while the angular brackets denote an average over the initial configurations, which in this case is not
needed.

$^4$ Another way to see this equation is to compute the probability of $q(t,t+l)$ with the dynamical functional
integral.



2. Weak time translation invariance: in the limit $t \to \infty$, the overlap $q(t,t+l)$ converges
to a well-defined limit distribution. We call $\alpha_l(\eta)$ the limit value of $\alpha_{t,t+l}(q=1;\eta)$.

3. Symmetry: for $l$ odd, the distribution of $q(t,t+l)$ is symmetric around $q = 0$. This
can be easily proved using the inversion symmetry $J \to -J$ of the distribution of the
synaptic couplings. It implies that $Q(t,t+l) = 0$ for $l$ odd and, more generally,
$\alpha_{t,t+l}(q;\eta) = \alpha_{t,t+l}(-q;\eta)$.

The variance of the distribution is larger than $1/N$ (the value for a binomial
distribution), since there is a positive correlation between $\sigma_i(t)\sigma_i(t')$ and $\sigma_j(t)\sigma_j(t')$ if $J_{ij}$ and
$J_{ji}$ have the same sign. The closing probability is accordingly larger than $2/2^N$. Both
the variance and the closing probability decrease with $l$ and increase with $\eta$.

4. For $l$ even, the most probable overlap, $Q(t,t+l)$, is positive and decreases with $l$
(correlations decay in time). In the phase where there is no remanent magnetization
($\eta < 0.825$) the limit value is zero, and $q(t,t+l)$ asymptotically reaches a symmetric
distribution (the limit of large $t$ having been taken in advance). The closing probability
consequently decreases with $l$ and increases with $\eta$.

5. The limit distribution reached in the limit $l \to \infty$ is found numerically to be the same
for both even and odd $l$. We denote the exponent of the closing probability by the
symbol $\alpha_\infty(\eta)$ (since the limit distribution is symmetric, the exponent is the same
both for closing with $q = 1$ and for closing with $q = -1$).

As a consequence, the closing probability has a maximum for $l = 2$ at fixed $t$, and it is
exponentially larger than for any other value of $l$. Thus cycles of length 2 are found with
the highest frequency.

In antisymmetric networks ($\eta < 0$) the situation is slightly more complicated: we have to
distinguish $l$ odd, $l = 4m$ and $l = 2(2m+1)$.

• For odd $l$ the average overlap is zero and the variance is smaller than in a binomial
distribution, since the correlation between $\sigma_i(t)\sigma_i(t')$ and $\sigma_j(t)\sigma_j(t')$ is negative if $J_{ij}$
and $J_{ji}$ have opposite sign. Thus the closing probability is smaller than in the case of
a binomial distribution and increases as $l$ increases. Accordingly, cycles of odd length
are very rare in antisymmetric networks.

• If $l = 2(2m+1)$ ($l = 2, 6, 10, \dots$) there is an effective antiferromagnetic interaction
between time slices $t$ and $t+l$. Thus $Q(t,t+l)$ is negative and the trajectories close
preferentially with $q(t,t+l) = -1$, producing cycles of length $2l$.

• If $l = 4m$ the effective interaction is ferromagnetic, and the behavior is the same as for
positive $\eta$.

In the case of negative $\eta$, we expect $\pi_N^{-}(t,t+l;-\eta) \approx \pi_N^{+}(t,t+l;\eta)$ for $l = 2(2m+1)$,
asymptotically in $N$. Thus the dominant closing probability is $\pi_N^{-}(t,t+2)$, and the most frequent
cycles are cycles of length 4. Since the transition regarding the attractors is governed by the
comparison between the closing probability with $l = 2$ and that with $l = \infty$ (see below), the
transition for negative $\eta$ should take place at $\eta_c' = -\eta_c$. Simulations confirm these
arguments very well, so that in most of what follows we limit our study to the case $\eta > 0$.

In finite systems the situation is more complicated, and some of these properties do not
hold when the times $t$ and $l$ are large with respect to system size. For instance, in this case
$q(t,t+l)$ does not reach an asymptotic distribution in the limit $t \to \infty$: it is easy to see that its
average value has to decrease at large $t$ as a consequence of the opening condition. In fact,
$\tilde{\pi}_N(t') = \sum_t \pi_N(t,t')$ (integrated closing probability) cannot exceed 1, since it represents the
probability that a trajectory closes at time $t'$. Thus $\pi_N(t,t+l)$ must finally decrease with
$t$ and the conditional distribution of the overlap $q(t,t+l)$ cannot become really stationary
in $t$. However, as $N$ increases this effect becomes weaker and weaker.

We are interested in the closure of the cycles, which takes place at a time-scale $\tau$
exponentially increasing with $N$, and it is not a priori clear whether we can neglect these finite size
effects or not. However, the predictions that can be drawn from this simplified description
capture the essential features observed in the simulations. We derive these predictions in the
following.



2.2 Cycles of length 2 

We now compute the probability that a trajectory closes on a cycle of length $l = 2$. Let us
consider $\eta > 0$. We want to show that, if $\alpha_2(1;\eta) < \frac{1}{2}\alpha_\infty(1;\eta)$, then all trajectories reach
attractors of length 2 with probability 1 as $N \to \infty$. We call this situation the oscillatory
phase.

It is very easy to go from the closing probabilities to the probability distribution of cycle
lengths. The probability that the trajectory has not closed up to time $t$ is given by

$$P_N(t) \approx \exp\left(-\sum_{t'=1}^{t}\sum_{l=1}^{t'} \pi_N(t'-l,t')\right), \qquad (7)$$

so the time-scale with which this quantity decays gives the time-scale of typical closing
events.
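
As an illustration of how equation (7) turns closing probabilities into a survival probability, the sketch below evaluates the double sum for an arbitrary closing probability supplied as a function. The name `survival` and the constant closing probability used in the example are assumptions made for illustration; a constant value mimics the situation where all closing probabilities have reached a common asymptotic limit.

```python
import math


def survival(closing_prob, t):
    """Probability that a trajectory has not closed up to time t, equation (7).

    `closing_prob(t0, t1)` is a user-supplied function returning the closing
    probability between times t0 and t1 (hypothetical interface).
    """
    total = 0.0
    for t_prime in range(1, t + 1):
        for l in range(1, t_prime + 1):
            total += closing_prob(t_prime - l, t_prime)
    return math.exp(-total)


if __name__ == "__main__":
    p = 1e-4  # illustrative constant closing probability, not a measured value
    for t in (10, 100, 200):
        # With a constant p the exponent is p * t * (t + 1) / 2, so the decay
        # is Gaussian in t with a time scale of order p ** -0.5.
        print(t, survival(lambda t0, t1: p, t), math.exp(-p * t * (t + 1) / 2))
```

With a constant closing probability $p$ the survival probability decays as a Gaussian in $t$ on the time scale $p^{-1/2}$, whereas a closing probability concentrated on a single value of $l$ gives a simple exponential decay.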

We introduce the function $f_N(t)$, which expresses the ratio between the probability that
a cycle of length $l \neq 2$ is reached at time $t$ and the probability that a cycle of length 2 is
reached at the same time:

$$f_N(t;\eta) = \frac{\sum_{l \neq 2} \pi_N(t-l,t)}{\pi_N(t-2,t)}. \qquad (8)$$

Under the above hypotheses, this is an increasing function of $t$ when $t$ is large and the overlap
distributions have reached their asymptotic form. The probability that a cycle of length 2 is
reached at time $t$, under the condition that a cycle was reached, is given by $(1 + f_N(t))^{-1}$.
This is very high for small $t$ and decreases with time, since more possible cycles come into
play. Let us compute this quantity at the time scale $\tau_2 = e^{N\alpha_2}$:

$$f_N(\tau_2;\eta) \approx \exp\big(N(2\alpha_2 - \alpha_\infty)\big), \qquad (9)$$

where $\alpha_2$ is the exponent of the asymptotic closing probability for cycles of length 2. Two
different situations occur:



• $\alpha_2 < \alpha_\infty/2$: oscillatory phase. $f_N(t)$ is exponentially small up to times of order
$\tau_2 = e^{N\alpha_2}$. But at such times the probability that a trajectory is not yet closed, $P_N(t)$,
is very small: $P_N(t) < \exp(-t/\tau_2)$. Thus, for very large sizes $N$, almost all the trajectories
close before this time, and they close on cycles of length 2.

• $\alpha_2 > \alpha_\infty/2$: chaotic phase. In this case, the time scale at which most of the trajectories
have closed is proportional to $\tau_\infty = e^{N\alpha_\infty/2}$, since $P_N(t) < \exp(-(t/\tau_\infty)^2)$. At this
time, $f_N(\tau_\infty)$ is already very large and the probability that a cycle of length
2 is reached vanishes with $N$.

Since $\alpha_2(\eta)$ tends to zero for $\eta \to 1$ (as we noted above, in this case the most probable
value of the overlap is $Q(t,t+2) = 1$ and the distribution is not exponential), while $\alpha_\infty$
does not vanish in this limit (actually, for $\eta = 1$ the value of $\alpha_\infty$ is infinite, since only cycles
of length 1 and 2 are found), and for $\eta = 0$ the exponents $\alpha_2$ and $\alpha_\infty$ are roughly equal, there
must exist a critical value of $\eta$ at which the condition $\alpha_\infty = 2\alpha_2$ is fulfilled.
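
The criterion separating the two phases can be written as a small helper, given here only as a sketch. The function names are hypothetical and the exponent values in the example are placeholders chosen for illustration, not values measured in this work.

```python
def phase(alpha_2, alpha_inf):
    """Phase predicted by comparing alpha_2 with alpha_inf / 2."""
    return "oscillatory" if alpha_2 < alpha_inf / 2 else "chaotic"


def log_closing_time(n, alpha_2, alpha_inf):
    """Logarithm of the dominant closing time scale for a system of size n:
    n * alpha_2 in the oscillatory phase, n * alpha_inf / 2 in the chaotic one."""
    return n * min(alpha_2, alpha_inf / 2)


def log_ratio_at_tau2(n, alpha_2, alpha_inf):
    """Logarithm of the ratio f_N at the time scale tau_2, as in equation (9)."""
    return n * (2 * alpha_2 - alpha_inf)


if __name__ == "__main__":
    # Placeholder exponents, for illustration only.
    for alpha_2, alpha_inf in ((0.1, 0.5), (0.3, 0.5)):
        print(alpha_2, alpha_inf, phase(alpha_2, alpha_inf),
              log_closing_time(100, alpha_2, alpha_inf),
              log_ratio_at_tau2(100, alpha_2, alpha_inf))
```

A negative value of `log_ratio_at_tau2` means that long cycles are exponentially suppressed at the time when cycles of length 2 typically close.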

2.3 Transient times 

According to the above argument, the typical closing times are given by

$$T_r \propto \exp\big(N \min(\alpha_2, \alpha_\infty/2)\big), \qquad (10)$$

and the argument predicts that the transient time increases exponentially with system size, with
an exponent that vanishes in the limit $\eta \to 1$. Close to this value, very large systems are
needed in order to distinguish this behavior from a stretched exponential or a power law.
For $\eta = 1$, the most likely value of $q(t,t+2)$ is $q = 1$ and the exponent $\alpha_2$ vanishes. Thus the
closing probability has to be computed before $q(t,t+2)$ reaches its stationary distribution.
In the Gaussian approximation,



$$\pi_N(t,t+2;\eta=1) \propto \exp\left(-\frac{N\,\big(1-Q_2(t;\eta=1)\big)^2}{2V_2(t)}\right) \approx \exp\left(-\frac{N}{2}\big(1-Q_2(t;\eta=1)\big)\right), \qquad (11)$$

where $V_2(t)/N$ denotes the variance of $q(t,t+2)$, since, for $Q_2$ close to 1, the variance goes
to zero as $1 - Q_2$. It is known from previous numerical studies [23] that
$1 - Q_2(t;\eta=1) \propto t^{-3/2}$. Thus the time-scale at which trajectories
close for $\eta = 1$ is a power law of system size:



$$T_r(\eta = 1) \propto N^{2/3}. \qquad (12)$$

This prediction is in good agreement with the numerical results in Nützel [10] and in the
present work. When $\eta$ is close to 1, there are two regimes: at small $N$, the trajectories close
when the overlap distribution is not yet stationary. Since $Q(t,t+2)$ tends to its stationary
value $Q_2(\eta)$ as $t^{-2/3}$ independent of $\eta$ [12], [10], we expect and observe also in this case that
$T_r \propto N^{2/3}$. At larger size the overlap becomes stationary by the time when cycles close, and
the exponential dependence shows up, with an exponent that, according to the Gaussian
approximation, is given by

$$\alpha_2(\eta) \approx \frac{\big(1-Q_2(\eta)\big)^2}{2V_2(\eta)}. \qquad (13)$$

Numerically, it is found that $\alpha_2(\eta) \propto (1-\eta)^2$.

2.4 Average length of the cycles

At the time $t \propto \exp(N\alpha_2)$ when typical cycles of length 2 close, the probability that a long
cycle closes goes as $\exp\big((2\alpha_2 - \alpha_\infty)N\big)$ and the average length of cycles longer than 2 grows
at most as $\exp\big((3\alpha_2 - \alpha_\infty)N\big)$. Thus, for $\alpha_2 < \alpha_\infty/3$, the typical length of the cycles and
the average length coincide, and they are equal to 2. At lower symmetry very large cycles
appear with a probability that, though vanishing, is large enough to let them dominate the
average length. At this point the average length of the cycles increases exponentially with
system size. This change takes place inside the oscillatory phase, in which the typical cycles
have length 2. We prefer to use the probability $P(2)$ to find a cycle of length 2 as the order
parameter to describe the transition, instead of using the logarithm of the average length of
the cycles, which is dominated by the tails of the distribution.

Our simulations show that the transition defined by $P(2)$ happens at $\eta_c \approx 0.33$, a value
much smaller than the threshold where the remanent magnetization vanishes, $\eta_{rm} = 0.825$,
and also smaller than the threshold at which the average length of the cycles starts to increase
exponentially, $\eta_L = 0.50$ [10].



2.5 Weights of the attraction basins 

Another quantity that can be used as an order parameter is the average weight of the
attraction basins. The weight $W_\alpha$ of the attraction basin of cycle $\alpha$ represents the probability
that a randomly chosen configuration ultimately reaches this cycle. The $W_\alpha$'s are random
variables, depending on the realization of the couplings. It is convenient to define the
moments of their distribution [26]

$$Y_n = \sum_\alpha W_\alpha^n. \qquad (14)$$

$Y_1$ is equal to 1 because of the normalization. The average weight of the attraction
basins, $Y_2$, is equal to the probability that two randomly chosen trajectories reach the same
attractor. Let us evaluate such a probability. We define $\pi_N^{(2)}(t,t')$ as the probability
that the configuration at time $t$ on the first trajectory is equal to the configuration at
time $t'$ on the second trajectory, given that the initial configurations are chosen at random.
This is, asymptotically in $t$, equal to $\exp(-N\alpha_\infty)$ if the remanent magnetization is zero,
and smaller otherwise. In fact, we expect that the overlap of two independent trajectories
is asymptotically equal to the overlap between configurations at long time distance along
the same trajectory, if the dynamics loses memory of the initial state (this requires that
the remanent magnetization vanishes), and is smaller in the case in which the memory of
the initial state is not lost. Predictions based on this expectation are well satisfied in other
disordered dynamical systems like Kauffman networks and discretized chaotic maps [37].



Thus we expect that the time when two trajectories in the same attraction basin
eventually meet grows as $\exp(N\alpha_\infty/2)$. In the oscillatory phase this is much larger than the
time scale over which a trajectory reaches a cycle, $\exp(N\alpha_2)$, so that we expect that the
probability that two trajectories meet before reaching their limit cycles tends to zero. This
is confirmed by the simulations.

In the chaotic phase the two time-scales are equal, and we expect non-vanishing $Y_n$.
Moreover, we expect that their values coincide with the moments of the weights of the
attraction basins of a Random Map model endowed with reversal symmetry, as in the
previously studied case $\eta = 0$ [21]. The computation follows exactly the same steps as in
that case: in fact, the Random Map distribution follows only from the facts that the closing
probabilities $\pi_N(t,t+l)$ tend to an asymptotic limit independent of both $t$ and $l$, and that
the "transient" regime (small $l$) does not contribute significantly to the closure of the cycles.



The result of the computation is [21]:

$$Y_{n+1} = \frac{(n!)^2}{2\,(2n+1)!}\,\big(4^n + 2^n\big). \qquad (15)$$

For instance, $Y_2 = 1/2$, $Y_3 = 1/3$, $Y_4 = 9/35$... Our numerical results are consistent with
this prediction (see Fig. 1), although, for $\eta > 0.1$, $Y_2$ is still far from the asymptotic regime
for all the systems that we could simulate.
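
The values quoted above can be reproduced with exact rational arithmetic. The sketch below assumes the form of equation (15) in which the prefactor is $(n!)^2 / (2\,(2n+1)!)$, which is the reading consistent with the quoted values of $Y_2$, $Y_3$ and $Y_4$; the function name `basin_moment` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def basin_moment(order):
    """Moment Y_order of the basin weights from equation (15).

    `order` is the index n + 1 >= 1 of the moment, so that
    Y_{n+1} = (n!)^2 * (4^n + 2^n) / (2 * (2n + 1)!).
    """
    n = order - 1
    return Fraction(factorial(n) ** 2 * (4 ** n + 2 ** n),
                    2 * factorial(2 * n + 1))


if __name__ == "__main__":
    for order in range(1, 6):
        print(order, basin_moment(order))
```

The first moment evaluates to 1, as required by the normalization of the weights.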



2.6 Average number of cycles 

We conclude the discussion of the properties of the attractors by showing that the closing
probabilities allow one to compute the average number of cycles of length $l$ through the relation

$$\overline{N_c(l)} = 2^N\left(\frac{1}{l}\,\pi_N^{+}(0,l) + \frac{1}{l}\,\pi_N^{-}(0,l/2)\right) \qquad (16)$$

(the last term is present only for even $l$). Thus the number of cycles varies exponentially with
$N$. For odd $l$ and $\eta > 0$, $\alpha_{0l}(q = 1)$ is smaller than $\ln 2$ (in fact, there is a positive correlation
between $\sigma_i(0)\sigma_i(l)$ and $\sigma_j(0)\sigma_j(l)$ if $J_{ij}$ and $J_{ji}$ have the same sign. The correlation changes
sign in the case of negative $\eta$, and the number of odd cycles becomes exponentially small).
For even $l = 2m$, $\alpha_{0l}(1)$ is also smaller than $\ln 2$ (in fact, the average value of $\sigma_i(0)\sigma_i(l)$ is
different from zero). The two cases are related: for instance, the number of cycles of length
2 increases as the square of the number of cycles of length 1. With our notation, this
relation reads $2\alpha_{01} - \alpha_{02} = \ln 2$.

For large $l$, no matter if even or odd, if the remanent magnetization is equal to zero,
the distribution of $q(0,l)$ tends to a binomial distribution with average value 1/2. Thus
the average number of cycles of length $l$ tends to 1 for large $l$. A curious consequence of a
non-vanishing remanent magnetization is that in this case the number of cycles of length $l$
increases exponentially with a non-vanishing exponent for every value of $l$.



3 Numerical results 
Figure 1: (a) Probability to find a cycle of length different from 2, and (b) average weight
of the attraction basins, as a function of system size. From top to bottom, the values of the
symmetry are $\eta$ = 0, 0.1, 0.2, 0.3, 0.32 (squares), 0.34 (stars), 0.4, 0.48, 0.5, 0.52, 0.56, 0.6.
The error bars are, in the worst cases, comparable with the symbol size.



3.1 Properties of the attractors 

In order to monitor the transition from short cycles to long attractors, we measured 5
quantities: the probability that a cycle of length 2 is reached, $P(2)$, the average weight of
the attraction basins (equation (14)), the average length of the cycles and of the transient
time, and the distribution of cycle lengths.

The probability to reach a cycle of length 2, $P(2)$, initially decreases with system size. For
$\eta > 0.34$, it reaches a minimum and starts to increase, going asymptotically to the value 1.
For $\eta < 0.32$ it seems that it always decreases with system size, apparently going to zero
(though we cannot exclude that it starts to increase at a size larger than the ones that we
could simulate). Thus we place the transition at $\eta_c \approx 0.33$ (see Fig. 1a), remarking that the
threshold could be overestimated.

The average weight of the attraction basins also has a non-monotonic behavior with
system size, but it leads to a different systematic error: it starts decreasing with system
size, then reaches a minimum and eventually increases for $\eta < 0.32$, probably tending to
the value 0.5 typical of a Random Map with reversal symmetry (dashed line in Fig. 1b).
For $\eta > 0.34$ $Y_2$ apparently tends to vanish, although we cannot exclude that it starts to
increase at a larger size. Thus from this measure too we estimate $\eta_c \approx 0.33$, but this time
the threshold could be underestimated. From this measurement and the previous one, we
conclude that $\eta_c = 0.33 \pm 0.01$.

Figure 2: Exponent of the average cycle length (full line) and of the average transient time
(dashed line) as a function of the asymmetry.

We fitted the average length of the cycles and the average transient time with an
exponential function of system size, $L \sim \exp(\alpha_L(\eta)N)$ and $T \sim \exp(\alpha_T(\eta)N)$ respectively. In
the first case, the fit is good for $\eta < 0.5$, and the exponent $\alpha_L(\eta)$ vanishes at this point. For
larger symmetry, the average length of the cycles tends to 2, but not in a monotonic way:
it starts increasing, reaches a maximum and then decreases. Thus we cannot exclude that
the exponential increase of the cycles stops at a smaller value of the symmetry parameter if
larger systems are considered.

The average transient time increases as $\exp(N \alpha_T(\eta))$ for $|\eta| < 1$, and $\alpha_T(\eta)$ vanishes at
$\eta = 1$ as $(1 - \eta)^{2.0 \pm 0.1}$, as expected. At $\eta = 1$ the transients grow as a power law of system
size, $T_r \propto N^b$, with $b = 0.66 \pm 0.01$, in agreement with the prediction of equation [1^ and
the previous numerical work by Nützel [10]. At $\eta = 0.9$ we found two regimes: for $N < 512$
the transient time increases as a power law with exponent $b = 0.66$, as in the system with
$\eta = 1$, whereas for $N > 1000$ it increases faster with $N$. The analysis of the closing
probability, which is a less noisy measure, allows one to conclude that the behavior with $N$
is, asymptotically, exponential with a very small exponent. The first regime reflects the time
needed to reach the asymptotic distribution of the overlap, as discussed in Section 2 for the
case $\eta = 1$.

We plot in Fig. 2 the exponents $\alpha_L(\eta)$ and $\alpha_T(\eta)$ for $\eta > 0$. We checked for $\eta = -0.4$
that, asymptotically in system size, $\alpha_L(-\eta) \approx \alpha_L(\eta)$ and $\alpha_T(-\eta) \approx \alpha_T(\eta)$.
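
The exponential and power-law fits described in this subsection amount to straight-line fits in suitable logarithmic variables. The following is a minimal sketch under that assumption; the helper names `fit_exponential` and `fit_power_law` are hypothetical, and the data are synthetic, noise-free values used only to show the procedure (they are not measurements from the paper).

```python
import numpy as np

def fit_exponential(sizes, values):
    """Fit values ~ C * exp(alpha * N) by least squares on log(values); returns (alpha, C)."""
    alpha, log_c = np.polyfit(np.asarray(sizes, dtype=float), np.log(values), 1)
    return alpha, np.exp(log_c)

def fit_power_law(sizes, values):
    """Fit values ~ C * N**b by least squares in log-log variables; returns (b, C)."""
    b, log_c = np.polyfit(np.log(sizes), np.log(values), 1)
    return b, np.exp(log_c)

# Synthetic data generated from the fit forms themselves.
sizes = np.array([16, 24, 32, 48, 64])
alpha, _ = fit_exponential(sizes, 3.0 * np.exp(0.1 * sizes))
b, _ = fit_power_law(sizes, 2.0 * sizes ** (2.0 / 3.0))
print(round(alpha, 3), round(b, 3))
```

With real data, comparing the residuals of the two fits over the available range of $N$ is one way to detect a crossover such as the one reported at $\eta = 0.9$.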

The distributions of the cycle lengths have different features in three different
regions. As a general rule, cycles of even length and cycles of odd length have to be
considered separately, not only because the former can be obtained in two different ways (they
close either with $q(t, t+l) = 1$ or with $q(t, t+l/2) = -1$) but also because the closing probability that
$q(t, t+l) = 1$ is larger for even $l$. We observed the following:
Figure 3: Probability distribution of the rescaled length of the cycles in the chaotic phase
($\eta = 0.2$), for $N = 32$, 48 and 64. Systems of different sizes have been rescaled.

1. $\eta < \eta_c$: in this region the distribution can be divided into two parts: a power-law part
for short cycles, and an almost exponential tail for long cycles. In this part of the
distribution, we found that the rescaled variable $L / \exp(\alpha_L(\eta) N)$ has a well-defined
limit distribution when $N$ increases.

As an example, we show in Fig. 3 three systems of different sizes with $\eta = 0.20$.

2. $\eta_c < \eta < 0.5$: in this region, the distribution is a power law with exponent smaller than
2. The average value is dominated by the tail of the distribution. The scaling with the
average period does not hold. As an example, we show three systems of different sizes
with $\eta = 0.48$ (Fig. 4a).

3. $\eta > 0.5$: in this case, the average period tends to 2. The distributions for the even
cycles have approximately the shape of a power law with an exponent larger than
2 and increasing with system size. The distribution of cycles of odd length is also
approximately a power law, with an exponent that does not decrease with system size,
but its total weight goes to zero as $N \to \infty$. As an example, we show in Fig. 4b four
systems with $\eta = 0.60$.



3.2 Relaxation of the energy 

The function $E(t) = -N^{-1} \sum_i |h_i(t)|$, where $h_i(t)$ is the local field experienced by neuron $i$ at
time $t$, is a Ljapunov function for the symmetric system ($\eta = 1$). In asymmetric networks,
its average value is still a non-increasing function of time, even if it may increase in some
realizations.
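
The Ljapunov property in the symmetric case can be checked numerically with a short sketch. It assumes the sign convention $E(t) = -N^{-1} \sum_i |h_i(t)|$ used in the abstract, symmetric Gaussian couplings of variance $1/N$ with zero diagonal, and the parallel sign rule; the function name `energy` is hypothetical.

```python
import numpy as np

def energy(j, sigma):
    """E = -(1/N) * sum_i |h_i|, with local fields h = J sigma."""
    return -np.mean(np.abs(j @ sigma))

rng = np.random.default_rng(1)
n = 64
a = rng.standard_normal((n, n))
j = (a + a.T) / np.sqrt(2.0 * n)      # symmetric couplings (eta = 1), assumed normalization
np.fill_diagonal(j, 0.0)

sigma = rng.choice([-1, 1], size=n)
energies = []
for _ in range(50):
    energies.append(energy(j, sigma))
    sigma = np.where(j @ sigma >= 0.0, 1, -1)   # parallel update of all spins

# For symmetric J the sequence should never increase.
print(energies[0], energies[-1], bool(np.all(np.diff(energies) <= 1e-12)))
```

Replacing `j` with a matrix that is not symmetric lets one observe the occasional increases of $E$ in single realizations mentioned above.
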
Figure 4: Probability distribution of the length of the cycles in the oscillatory phase: (a)
$\eta = 0.48$, (b) $\eta = 0.60$. Odd and even lengths are shown separately. $N = 32$, 48 and 64
(from top to bottom).



We show in Fig. 5 $E(t)$ for $N = 256$ and several values of $\eta$ between 0 and 1. We
imposed the condition that the trajectory is not yet closed when the energy is measured
(opening condition).
Without this condition, at high symmetry the trajectories ultimately find cycles of length 2
and the energy reaches a stationary value corresponding to the average energy of such cycles.
With the opening condition the energy decreases to lower values. The effect of the opening
condition is very small (not even significant) at small $\eta$ and increases as $\eta$ grows.

At $\eta = 0$ the energy density is constant in time, and its magnitude is equal to the expectation value of
the modulus of a Gaussian variable, $E(\eta = 0) = -\sqrt{2/\pi} = -0.798$. At larger $\eta$, the asymptotic
value of the energy is lower and is attained later in time. In Fig. 5 we show the relaxation
of the energy for different symmetries, for systems of size $N = 256$.
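
The numerical value quoted for $\eta = 0$ follows from a one-line integral, sketched here under the assumption that each local field $h$ is a Gaussian variable with zero mean and unit variance (which corresponds to couplings of variance $1/N$):

```latex
\langle |h| \rangle
  = \int_{-\infty}^{\infty} \frac{dh}{\sqrt{2\pi}} \, |h| \, e^{-h^2/2}
  = \frac{2}{\sqrt{2\pi}} \int_{0}^{\infty} h \, e^{-h^2/2} \, dh
  = \sqrt{\frac{2}{\pi}} \approx 0.798 .
```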

At high symmetry, we observed an interesting phenomenon: the average energy of the
fixed points, which are reached with vanishing probability as $N \to \infty$, is significantly lower
than the energy of the typical cycles of length 2. In Fig. 6a we show, as a function of $\eta$,
the infinite-size extrapolation of the energy in cycles of length 1 and of length 2 respectively
(the extrapolation was made using $N^{-1/2}$ as finite-size scaling, which gives very good fits).
It can be seen that the difference increases with $\eta$. At $\eta = 1$ the energy of cycles of length
2, which is also the typical energy of the parallel dynamics, is slightly higher than the one
computed in [29] with a Monte Carlo simulation of the infinite-size dynamics. The energy of
the fixed points, on the other hand, is much lower, and its extrapolated value, $-1.55 \pm 0.01$,
could be even lower than the zero-temperature energy of the SK model, $E = -1.526$. This
difference seems to reflect a more general tendency: cycles of odd length have a
lower energy than cycles of even length (Fig. 6b). This effect may be due to the fact that
cycles of even length are found more rapidly than cycles of odd length (the typical transient
times are shorter, as we expect from the closing probabilities), so that on trajectories ending
in odd cycles the energy has more time to decrease.

Figure 5: $E(t)$ on trajectories not yet closed for $N = 256$ and $\eta$ respectively equal to 0, 0.32,
0.50, 0.60, 0.80, 0.90 and 1. The energy is a decreasing function of $\eta$.
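
The infinite-size extrapolation of the energy mentioned in this subsection uses the form $E_N = E_\infty + A N^{-1/2}$, which is linear in the variable $N^{-1/2}$. A minimal sketch of such a fit follows; the function name `extrapolate_to_infinite_size` is hypothetical, and the numbers are synthetic values generated from the fit form itself, not data from the paper.

```python
import numpy as np

def extrapolate_to_infinite_size(sizes, energies):
    """Least-squares fit of E_N = E_inf + A * N**(-1/2); returns (E_inf, A)."""
    x = np.asarray(sizes, dtype=float) ** -0.5
    a, e_inf = np.polyfit(x, np.asarray(energies, dtype=float), 1)
    return e_inf, a

# Synthetic values used only to illustrate the procedure.
sizes = np.array([32, 64, 128, 256])
energies = -1.4 + 0.5 / np.sqrt(sizes)
e_inf, a = extrapolate_to_infinite_size(sizes, energies)
print(e_inf, a)
```

The intercept of the straight line at $N^{-1/2} = 0$ is the extrapolated energy; the quality of the fit can be judged from the residuals at the available sizes.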



3.3 Overlap and closing probabilities 

Though it is not the main point of this work, we present here some numerical results about
the distribution of the overlap, measured on trajectories not yet closed (opening condition).
All our data in this section refer exclusively to the distribution of $q(t, t+l)$ subject to this
condition.
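
As a rough illustration of how the mean and variance of the overlap can be estimated, the sketch below assumes the usual definition $q(t, t+l) = N^{-1} \sum_i \sigma_i(t) \sigma_i(t+l)$ and the parallel sign rule. For brevity it does not impose the opening condition (trajectories that have already closed are not discarded), so it is a simplification of the measurement described here. All function names are hypothetical.

```python
import numpy as np

def overlap(sigma_a, sigma_b):
    """q = (1/N) * sum_i sigma_a[i] * sigma_b[i]."""
    return float(np.mean(sigma_a * sigma_b))

def trajectory(j, sigma, steps):
    """Configurations sigma(0), ..., sigma(steps) under the parallel sign rule."""
    traj = [sigma]
    for _ in range(steps):
        sigma = np.where(j @ sigma >= 0.0, 1, -1)
        traj.append(sigma)
    return np.array(traj)

def overlap_moments(j_samples, sigma_samples, t, l):
    """Mean and variance of q(t, t+l) over an ensemble of (J, sigma(0)) pairs."""
    q = []
    for j, sigma in zip(j_samples, sigma_samples):
        traj = trajectory(j, sigma, t + l)
        q.append(overlap(traj[t], traj[t + l]))
    return float(np.mean(q)), float(np.var(q))

rng = np.random.default_rng(3)
n, samples = 32, 20
# Uncorrelated Gaussian couplings (eta = 0), assumed normalization 1/sqrt(N).
j_samples = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(samples)]
sigma_samples = [rng.choice([-1, 1], size=n) for _ in range(samples)]
print(overlap_moments(j_samples, sigma_samples, t=4, l=2))
```

Because the update rule is odd under $\sigma \to -\sigma$, a trajectory started from $-\sigma(0)$ is the mirror image of the one started from $\sigma(0)$; this is the reversal symmetry behind the closure condition $q = -1$.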

The first figure that we show refers to the mean value and to the variance of the distribution
of $q(t, t+l)$. We show these quantities as a function of $t$ for different values of $l$. As
we noted in Section 2, the average overlap is zero by symmetry when the time difference
$l$ is odd, so we show its value only for even $l$. For $\eta > 0$, it is always a non-decreasing
function of $t$, and it soon reaches an asymptotic value (Fig. 7a). The fact that $Q(t, t+l)$
is non-decreasing means that it becomes more and more difficult to lose the memory of the
configuration as time increases. The asymptotic value, $Q^*_l$, is a decreasing function of $l$. We
show its relaxation for different values of the asymmetry in Fig. 8a. It is evident from the
figure that the relaxation becomes slower and slower as $\eta$ increases. Pfenning, Rieger and
Schreckenberg |J and Eissfeller and Opper |TJ] observed a transition at $\eta = 0.825$ between a
regime at high symmetry where $Q(0, l)$ relaxes as a power law to a non-vanishing limit value,
$Q(0, l) \sim Q_\infty + A l^{-a}$, and a regime at low symmetry where the relaxation is exponential and
the remanent magnetization vanishes, $Q(0, l) \propto \exp(-a l)$. We expect to observe the same
transition for the asymptotic value of $Q(t, t+l)$, but our data do not allow us to verify this
point.

5 We note that, due to the opening condition, this quantity has a meaning different from the usual
Edwards-Anderson order parameter, which measures the size of an asymptotic state of the dynamics.

Figure 6: (a) $E$ in cycles of length 1 and 2, as a function of $\eta$. The data are obtained by
extrapolating to infinite size through the fit $E_N = E_\infty + A N^{-1/2}$. (b) $E$ as a function
of cycle length $l$ for systems with $N = 64$ and different symmetries. For large symmetry
there are no data for odd $l$ (except $l = 1$), since such cycles are met very rarely. For small
symmetry the short cycles are almost never found, and the error is very large.

Figure 7b shows the variance of $q(t, t+l)$ as a function of $t$ for different values of $l$. The
variance is always an increasing function of $t$, and reaches an asymptotic value as $t$ increases.
The cases of odd and even $l$ have to be distinguished. The variance is larger for $l = 2m + 1$,
and decreases as a function of $m$. For $l = 2m$ the variance is smaller, and increases as a
function of $m$. The asymptotic value of $V(t, t+l)$ is shown as a function of $l$ (for even
and odd $l$ separately) in Fig. 8b. Since odd and even variances have opposite behaviors
as a function of $l$, the function $V(t; l) = (V(t, t+l) + V(t, t+l+1))/2$ ($l$ odd) shows a
very small dependence on $l$. This dependence is however systematic: $V(t; l)$ is an increasing
function of $l$ when $t$ is large, and a decreasing one when $t$ is small. In particular, $V(0; l) \approx 1$ (the
totally random case) for every value of $\eta$. We also verified that, for $\eta \approx 1$ and $t$ large,
$V(t, t+2) \propto 1 - Q(t, t+2)$, a relation that we used in Section 2.

The closing probability $\pi_N(t, t+2)$ also increases as a function of $t$ to an asymptotic value
$\pi^*_N(2; \eta) \propto e^{-N \alpha_2(\eta)}$. We plot in Fig. 9a the behavior of $(\pi^*_N(2; \eta))^{1/N}$ as a function of $\eta$ for
systems of different sizes. It appears to converge very slowly to a function independent
of $N$, which is an even function of $\eta$, has a cusp at $\eta = 0$ and is concave downward. The
limit value for large $N$ is 1 for $\eta = 1$, because $\pi^*_N(2; \eta = 1)$ decreases as a power law of $N$,
and is less than 1 for $\eta < 1$, indicating that the exponential scaling is fulfilled for $\eta < 1$.

Before going deeper into the analysis of the closing probabilities, we note that all
the quantities that we observed, namely $Q_N(t, t+l)$, $V_N(t, t+l)$ and $\pi_N(t, t+l)$, reach an
asymptotic value in $t$ only approximately. In reality, all these quantities must decrease with
$t$, at least on time scales of the order of the inverse of the closing probability (see Section
2). This is observed in the simulations. However, the discussion presented above, based on
stationary closing probabilities, is not essentially modified by this fact, since the decrease of
the closing probability becomes slower and slower as system size increases: we find, for
very large $t$ and for $l = 2$, $\pi_N(t, t+2) \propto t^{-b_N}$, with $b_N \propto N^{-(1+\epsilon)}$. Thus $\pi_N(t, t+l)$ can
be considered constant even on time scales increasing exponentially with $N$, like the ones
involved in the closure of the cycles.

Figure 7: Average value $Q(t, t+l)$ (a) and variance $N V(t, t+l)$ (b) of the overlap $q(t, t+l)$
over trajectories not yet closed. Here $\eta = 0.6$ and $N = 128$. The different curves are for
different values of $l$. In panel (a), only even values of $l$ between 2 and 30 are shown,
from top to bottom. In panel (b) the lower bundle of curves corresponds to even
values of $l$ (and $V(t, t+l)$ is an increasing function of $l$), the higher bundle corresponds to
odd $l$ (and $V(t, t+l)$ is a decreasing function of $l$).

Figure 8: Large-time value of $Q(t, t+l)$ (a) and $N V(t, t+l)$ (b) as a function of $l$ for $N = 192$
and $\eta$ = 0, 0.2, 0.32, 0.5, 0.7, 0.8, 0.9, 1. In panel (a) only even values of $l$ are shown
and $\eta$ grows from bottom to top. In panel (b) even values of $l$ are represented as
dashed lines ($\eta$ grows from bottom to top) and odd values as solid lines ($\eta$ grows from top
to bottom for $l = 2$). At $\eta = 0$ there is no difference between even and odd $l$.

Figure 9: Closing probability for cycles of length 2 as a function of $\eta$ for systems of different
sizes (a) and as a function of cycle length $l$, for $N = 64$ and $\eta = 0.32$ and 0.56 (from bottom
to top) (b).
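
The statement that such a slow power-law decay is negligible on exponentially long time scales can be made explicit with a short estimate. The constants $a$ and $c$ below are hypothetical positive numbers introduced only for illustration, and $\epsilon > 0$ is assumed:

```latex
b_N = c \, N^{-(1+\epsilon)}, \qquad t = e^{aN}
\quad \Longrightarrow \quad
t^{-b_N} = \exp\left( -b_N \ln t \right)
         = \exp\left( -a c \, N^{-\epsilon} \right)
         \longrightarrow 1 \quad (N \to \infty) .
```

Under these assumptions the relative decrease of the closing probability over a time $e^{aN}$ vanishes as $N^{-\epsilon}$, which is why it can be treated as stationary on the time scales relevant for the closure of the cycles.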

We then report the asymptotic closing probability $\pi^*_N(l; \eta)$ as a function of $l$ for two
different values of $\eta$. The characteristic even-odd oscillations in $l$ show up. Apart from that,
the closing probability is a decreasing function of $l$. It is not clear whether it reaches a
stationary value, $\exp(-N \alpha_\infty(\eta))$, when $l$ increases, since it is very difficult to measure the closing
probability for large $l$. In any case, we find that, contrary to our hypothesis, the value of
$l$ at which $\pi^*_N(l; \eta)$ seems to become stationary increases with $N$. Thus the exponent $\alpha_\infty(\eta)$
seems not to exist, and in any case it cannot be determined numerically.

4 Discussion 



It is known that the deterministic dynamics of asymmetric neural networks exhibits complex
features. Among these features, we investigated here the transition between cycles of length
2 and "chaotic" (exponentially long) attractors, first reported in [10]. We found that,
though the average cycle length shows an abrupt change at $\eta = 0.5$, the region where cycles
of length 2 are stable extends down to $\eta_c = 0.33 \pm 0.01$. In this region nearly all the trajectories
end up in cycles of length 2. The number of these cycles is exponentially large and the
weight of their attraction basins is vanishingly small. At lower symmetries the probability
of finding a cycle of length 2 drops abruptly to zero (though their number is still exponentially
large). The typical attractors are now cycles exponentially long with system size, whose
number increases only proportionally to $N$. The weights of the attraction basins seem to
have the same distribution as in a Random Map model with reversal symmetry pi| , |2"T] . The
typical times after which attractors are reached, on the other hand, vary exponentially with
system size for every finite asymmetry, and vary as a power law, $T_r \propto N^{2/3}$, for $|\eta| = 1$.

All these features can be predicted with arguments relying, through the closing probabilities,
on the distribution of the overlap in an infinite system, but the point where the
transition takes place cannot be computed without a more detailed knowledge of the overlap
distribution. However, this transition does not correspond to a transition in the infinite system.
Asymmetric neural networks do present a transition in the infinite system, between a phase at low symmetry
where the remanent magnetization vanishes, $Q_\infty = \lim_{l \to \infty} Q(0, l) = 0$ (loss of memory), and
a phase at high symmetry where the remanent magnetization is finite. This takes place at
$\eta_{RM} = 0.825$ m |12| , which is a value much larger than $\eta_c = 0.33$, at which cycles of length
2 cease to be the typical attractors.

We can explain qualitatively the tendency of the trajectories to close on cycles of length
2 as a consequence of the parallel updating and of the symmetry of the interactions. Both
these elements conspire to create an effective ferromagnetic interaction between the spin
$\sigma_i(t)$ and the same spin two time steps later (for negative $\eta$ the effective interaction is
antiferromagnetic, and it tends to reverse the spins after two time steps, thus resulting in
cycles of length 4). The transition takes place when the sum of the closing probabilities of
long cycles balances that of cycles of length 2. This balance involves both the number
of cycles and the weights of their attraction basins. Cycles of length 2 are much more numerous
than cycles of any other length at every value of $\eta > 0$ (their number grows exponentially
with system size, with an exponent which is double the corresponding exponent for cycles of
length 1, H), but their attraction basins are vanishingly small, and, at a certain point, very
long cycles, whose number grows only linearly with system size, come to occupy the overwhelming
majority of phase space.

The dynamics of the system is a kind of relaxation, with the function $E(t)$ defined in (||])
playing the role of an "energy". This analogy is exact in the symmetric system, where
$E(t)$ decreases at every time step until a cycle is reached. In the asymmetric systems $E(t)$
decreases on average, but not in every realization. The asymmetry introduces something
similar to thermal noise in the dynamics: the average asymptotic value of $E$ increases as
$\eta$ decreases. Fixed points are states of low energy, but they are very difficult to reach
because of the competition of higher-energy attractors (either cycles of length 2 or very long
cycles), which are easier to reach.

We can ask ourselves what changes in the above description if thermal noise is introduced.
Of course thermal noise destroys the limit cycles which are produced by the deterministic
dynamics, but at low temperature some metastable states reminiscent of the cycles of length
2 may still survive. This discussion can be put on a precise basis in symmetric networks
with $\eta = 1$. In this case detailed balance is fulfilled with a suitable definition of the noise,
and at equilibrium the statistical state of the system is described by a kind of Boltzmann
distribution, for which the fluctuation-dissipation theorem holds:



$$\Pr(\sigma_1 \cdots \sigma_N) \propto \prod_{i=1}^{N} \left( e^{\beta h_i} + e^{-\beta h_i} \right), \qquad (17)$$

where $h_i = \sum_j J_{ij} \sigma_j$ is the local field experienced by spin $i$ JT^]. This statistical description
no longer holds for asymmetric networks, for which detailed balance breaks down.
Ferraro [28] and Scharnagl et al. [29] computed the asymptotic value of the energy, which
in this model is defined as $-\langle h_i \tanh(\beta h_i) \rangle$, and at $\beta = \infty$ coincides with definition |J| for the
system with $\eta = 1$. They used the Monte Carlo scheme of [17], which gives results free of
finite-size effects (even if for computational reasons there is a limit of $t \sim 100$ on the number of time
steps that can be reliably performed). At temperatures larger than 0.6 the energy follows
the theoretical prediction for the SK model with remarkable accuracy, as expected in |3Tj,
but at lower temperatures the energy is considerably higher than what is expected for the SK
model [28], [29]. This may be due to the fact that the system remains trapped in metastable
states corresponding to energy minima that at $T = 0$ are cycles of length 2. As we observed,
the energy of these states is significantly higher than the energy of the fixed points (which
are low-energy states for the SK model) and their attraction basins cover nearly all of phase
space. We did not perform direct simulations to test this interpretation, nor to see whether these
states are also found for asymmetric couplings up to some critical value of the asymmetry,
but we think that this could be an interesting issue.
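
Detailed balance with respect to the distribution of equation (17) can be verified exhaustively for a very small symmetric network. The sketch below assumes one particular "suitable definition of the noise", which is not spelled out in the text: all spins are updated in parallel, and spin $i$ takes the value $\pm 1$ with probability $e^{\pm \beta h_i} / (2 \cosh \beta h_i)$. The function names are hypothetical.

```python
import itertools
import numpy as np

def all_states(n):
    """All 2**n spin configurations as rows of +/-1."""
    return np.array(list(itertools.product([-1, 1], repeat=n)))

def stationary_weights(j, beta, states):
    """Normalized weights prod_i (exp(beta*h_i) + exp(-beta*h_i)), as in equation (17)."""
    h = states @ j.T                              # h[s, i] = sum_k J_ik * sigma_k
    w = np.prod(2.0 * np.cosh(beta * h), axis=1)
    return w / w.sum()

def transition_matrix(j, beta, states):
    """W[s, s'] for parallel noisy updates (assumed rule, see the text above)."""
    h = states @ j.T
    log_norm = np.sum(np.log(2.0 * np.cosh(beta * h)), axis=1, keepdims=True)
    return np.exp(beta * (h @ states.T) - log_norm)

rng = np.random.default_rng(2)
n, beta = 4, 0.7
a = rng.standard_normal((n, n))
j = (a + a.T) / np.sqrt(2.0 * n)                  # symmetric couplings (eta = 1)
np.fill_diagonal(j, 0.0)

states = all_states(n)
p = stationary_weights(j, beta, states)
w = transition_matrix(j, beta, states)
flow = p[:, None] * w                             # probability flow from s to s'
print(np.allclose(flow, flow.T))                  # detailed balance check
```

With this rule the flow from $\sigma$ to $\sigma'$ is proportional to $\exp(\beta \sum_{ij} \sigma'_i J_{ij} \sigma_j)$, which is symmetric under exchange of the two configurations only if $J_{ij} = J_{ji}$; repeating the check with a non-symmetric `j` shows the breakdown of detailed balance.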

Asymmetric networks at finite temperature were recently simulated in [14], though with
Langevin dynamics whose equilibrium distribution, for $\eta = 1$, is given by the SK model.
The authors found aging effects and non-trivial overlap distributions at small but finite
asymmetry. We agree with them about the absence of aging for $\eta < 0.8$: for these systems,
the average overlap $Q(t, t+l)$, subject to the opening condition, reaches for every $l$ a
time-translation-invariant state, where it no longer depends on $t$. Our data suggest that
this may no longer hold at larger values of $\eta$ (and it can be speculated that the threshold
coincides with $\eta_{RM} = 0.825$, above which the remanent magnetization is different from
zero), but the time window that we analyzed is too small to state anything definite about this
point.

The results that we present here also have to be compared with a recent investigation of
the relaxation dynamics of the SK model at $T = 0$ [31]. Using different kinds of dynamics,
defined as reluctant, sequential and greedy, it was found, among other things, that the typical
fixed points reached have different energy densities for the different dynamics ($E_g = -1.416$,
$E_s = -1.430$ and $E_r = -1.492$). The energy density that we found for $\eta = 1$, extrapolated
to infinite size, is higher in cycles of length 2 ($E_2 = -1.399$), but it is lowest on the fixed
points ($E_1 = -1.55 \pm 0.01$), and it could be even lower than the zero-temperature energy of
the SK model, $E = -1.526$.

The transition that we investigated presents some features similar to the one taking
place in Kauffman networks, a disordered dynamical system proposed as a model of genetic
regulation [16]. Also in that case the average length of the attractors increases exponentially
in the chaotic phase, where the weights of the attraction basins follow the Random Map
distribution [p2| , and does not depend on system size in the frozen phase. Nevertheless,
despite the similarity between the chaotic phases of the two models, the frozen phases are
quite different. In the frozen phase of the Kauffman model the number of attractors has a
finite limit as system size increases, the average weight of the attractors does not vanish
and the transient time is also finite. Moreover, in the infinite-size limit a phase transition
corresponding to the transition for the attractors takes place, between a phase without
damage spreading and a phase where a small damage propagates to the whole system [32].
None of these features is present in asymmetric neural networks. While the transition in
Kauffman networks is a consequence of their finite connectivity (every element receives inputs
from exactly $K$ elements), which in turn implies that, in the frozen phase, only a finite
number of elements are relevant for the dynamics |33], |33j], asymmetric neural networks are
a system with infinite connectivity, and their oscillatory phase shows much less order than
the frozen phase of Kauffman networks.

Though it is a system with a finite number of states, this system shows, for $\eta = 0$ and
in the infinite-size limit, some relevant features of chaos in continuous systems QT3 1 . At $\eta = 1$,
on the other hand, the system never loses memory of its initial condition. At intermediate
symmetries the system shows features that are "chaotic" and features that are "ordered".
The analogy can also be carried through the study of the attractors. In discretized
chaotic systems "artificial" limit cycles are present due to the finiteness of phase space. The
attraction basins of these cycles follow Random Map statistics p7|, in agreement
with what is observed here for $\eta < 0.33$. The closing time of these cycles increases as $\epsilon^{-D_2/2}$
35|, I36L EjJ , where $\epsilon$ is the discretization and $D_2$ is the correlation dimension [37]. This
is also analogous to what is observed in the present model for $\eta < 0.33$ if we identify $2^{-N}$
with $\epsilon^D$ and $\alpha_\infty / \ln 2$ with $D_2 / D$ (the fact that, even in the most "chaotic" case $\eta = 0$, we
find $\alpha_\infty < \ln 2$ can be interpreted as if in this model the correlation dimension were always
less than the dimension $D$ of the embedding space). But for $\eta > 0.33$ the situation is less
clear: the closing time still increases exponentially with system size, as if there were a finite
"correlation dimension" of the asymptotic configurations equal to $\alpha_2 / \ln 2$, but the length
of the cycles does not increase with $N$ and the statistics of the attraction basins is not of
the Random Map type. Moreover, the number of cycles increases exponentially with $N$.
Thus neither the analogy with a discretized chaotic system nor the analogy with a periodic
system holds.
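
The identification between the two pictures can be spelled out with one line of algebra, using only the correspondences stated above (the discretization $\epsilon$ with $\epsilon^D = 2^{-N}$, and $D_2 / D = \alpha_\infty / \ln 2$):

```latex
\epsilon^{-D_2/2}
  = \left( \epsilon^{D} \right)^{-D_2/(2D)}
  = \left( 2^{-N} \right)^{-D_2/(2D)}
  = \exp\left( N \, \frac{D_2}{D} \, \frac{\ln 2}{2} \right)
  = \exp\left( \frac{N \alpha_\infty}{2} \right) .
```

Under this identification the closing time of the discretized chaotic system grows exponentially with $N$, and the bound $\alpha_\infty < \ln 2$ corresponds to $D_2 < D$.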